Text File | 1997-07-30 | 58.8 KB | 1,311 lines |
- ===================================================
- COMP.TEXT.SGML - FREQUENTLY ASKED QUESTIONS(c) v1.1
- ===================================================
-
- UPDATED SECTIONS
- ================
-
- 07) Added/updated: Agfa SDMS, DocuBuild SGML, MacroTag, SGML/CALS
- Translator, SGML Hammer, (Shaffstall) SGML Translator, Silversmith,
- TABLETAG, TagWorX.
-
- 09) Added section on national/international Standards bodies.
-
- INTRODUCTION
- ============
-
- This article contains answers to questions that are frequently posted
- to comp.text.sgml. It is intended for newcomers to the newsgroup and
- SGML beginners.
-
- This FAQ is maintained on a voluntary basis, but any comments or
- additional information are welcome (see final section). It is a large
- file, and has to be split into two parts to guarantee safe transmission
- over the network (Part 1 contains sections 01-09, Part 2 contains
- sections 10-end).
-
- CONTENTS
- ========
-
- 01) What does "SGML" stand for? What is SGML?
- 02) What can SGML be used for?
- 03) What do I need to use SGML?
- 04) Books, Bibliographies, Newsletters and Journals
- 05) Newsgroups and discussion lists
- 06) Public Domain software
- 07) Commercial software
- 08) ftp archives
- 09) The SGML Users' Group, National Chapters, SIGs, Standards bodies
- 10) Conferences
- 11) SGML initiatives and major projects
- 12) SGML and other Standards
- 13) Introductory questions with answers - by Erik Naggum
- 14) Making comments/additions to this FAQ
-
- If any of the terms used in this FAQ are unfamiliar to you, consult
- the section on "Introductory questions with answers", the text of
- the SGML standard (see 01.1), or a good book on SGML (see below).
-
-
- 01) What does "SGML" stand for? What is SGML?
- ==============================================
-
- 01.1 SGML stands for the Standard Generalized Markup Language. SGML is
- defined in ISO 8879:1986 "Information Processing -- Text and Office
- Systems -- Standard Generalized Markup Language (SGML)". A copy of
- this document can be obtained from the International Organization for
- Standardization (ISO) or your national standards body. It is not
- available via ftp.
-
- 01.2 SGML enables the description of structured information _independent_
- of how that information is processed. It is a meta-language that provides
- a standard syntax for defining descriptions of classes of structured
- information; these descriptions are called document type definitions
- (DTDs). Information can be "marked up" according to a DTD, so that its
- structure is made explicit and accessible. The "markup" can be checked
- against a DTD to ensure that it is valid, and thus that the structure of
- the information conforms to that of the class described by the DTD.
- Ensuring that information is structured in a known way greatly facilitates
- any subsequent use of that information. For more information, beginners
- should read Erik Naggum's "Introductory questions with answers"
- (in this FAQ), consult ISO 8879 and/or a book on SGML (also in this FAQ).
-
- 01.3 DTDs define the rules to structure information but do _not_ say how that
- information should be processed. Therefore, SGML and DTDs do not deal
- with how, say, a document should be processed for formatting on paper
- (via LaTeX), display on-line (via Hypercard), and mapping into a document
- database (via Oracle). Having made the structure of the document
- explicit, however, they enable all these subsequent processes to use
- exactly the same source document. SGML is _not_ a replacement for de facto
- standards such as TeX or PostScript.
-
- 01.4 SGML is non-proprietary. Publication of, and amendments to, the
- International Standard are controlled solely by the ISO.
-
- 01.5 Use of SGML is not confined to any particular make or type of
- computer or software. SGML-aware products are available for most types
- of machine.
-
- 01.6 SGML is not user-unfriendly. What SGML is, what it looks like "in the
- raw", and how other software is able to make use of SGML markup, will be
- of little concern to most users. Sophisticated packages now exist to
- create, edit, and manipulate information that has been marked up with SGML,
- e.g. quasi-WYSIWYG editors for creating SGML documents that conform to
- any given valid DTD.
-
-
- 02) What can SGML be used for?
- ==============================
-
- 02.1 Its uses are many and diverse. SGML DTDs (see 01.2) define the markup
- and markup rules that can be used for a given class of documents (where
- a "document" is a file of information). A DTD is usually written with
- some kind of end processing in mind, but since SGML markup is
- application-independent, documents that conform to a particular DTD
- can be re-used in a variety of different ways.
-
- 02.2 For example, a document designer might write a DTD that enables the
- abstract of a scientific paper to be marked up as such. The primary
- purpose is that the text identified as forming part of a paper's
- abstract can then be formatted in a particular way when the SGML source
- document is translated into a file usable by a text processing system.
- If, at some later date, it is decided that abstracts should be formatted
- in a different way, it is only necessary to alter the translation program
- and not every instance of an abstract in every paper that has to be
- (re-)printed out. Moreover, knowing that in every document conforming
- to the particular DTD, abstracts will be identified as such, it is a
- trivial matter to combine papers supplied by several authors into a
- collection that has a uniform physical appearance, or to produce a
- catalogue of abstracts for publication or inclusion in a database.
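-
- The idea in 02.2 can be sketched in a few lines of Python. This is only
- an illustration under stated assumptions: the element names, the sample
- text and the two style tables are hypothetical, and a regular expression
- stands in for a real SGML parser (which any serious system would need).
-

```python
import re

# Hypothetical SGML-style source: the abstract is identified as such,
# and the document itself carries no formatting information.
SOURCE = ("<paper><title>On Markup</title>"
          "<abstract>Structure matters.</abstract></paper>")

# Two interchangeable "translation programs", written as template tables.
# Changing how abstracts look means editing a table, not the documents.
STYLE_ITALIC = {
    "title": "\\section*{{{0}}}",
    "abstract": "\\textit{{{0}}}",
}
STYLE_QUOTE = {
    "title": "\\section*{{{0}}}",
    "abstract": "\\begin{{quote}}{0}\\end{{quote}}",
}


def extract(element, source):
    """Return the text of every <element>...</element> found in source."""
    pattern = r"<{0}>(.*?)</{0}>".format(re.escape(element))
    return re.findall(pattern, source, flags=re.DOTALL)


def translate(source, style):
    """Format each element named in the style table, in table order."""
    lines = []
    for element, template in style.items():
        for text in extract(element, source):
            lines.append(template.format(text))
    return "\n".join(lines)


def catalogue(sources):
    """Collect the abstracts of several papers into one list."""
    return [text for src in sources for text in extract("abstract", src)]


if __name__ == "__main__":
    print(translate(SOURCE, STYLE_ITALIC))
    print(translate(SOURCE, STYLE_QUOTE))
    print(catalogue([SOURCE, SOURCE]))
```

-
- Switching from STYLE_ITALIC to STYLE_QUOTE changes the appearance of every
- abstract without touching SOURCE, and catalogue() re-uses the same markup
- for a quite different purpose.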
-
- 02.3 If you want to know whether SGML is appropriate for a particular task,
- consult the current discussion lists, journals, and Special Interest
- Groups (SIGs), and/or post to a newsgroup such as comp.text.sgml. People
- are always willing and eager to hear about new ways that SGML might be
- used.
-
-
-
- 03) What do I need to use SGML?
- ===============================
-
- 03.1 In order to use SGML you will need an SGML parser (that conforms to
- ISO 8879), an entity manager, an editor to produce your DTDs and/or
- SGML documents, and probably some sort of translation program to convert
- your SGML documents into a form suitable for some specific processing.
- If you are planning to convert existing information into SGML documents,
- you will need some sort of "retro-tagging" or auto-conversion software.
-
- 03.2 ISO 8879 contains very precise definitions of terms such as
- "SGML system", "SGML application", "SGML parser", "entity manager" and
- so on. Users are advised to consult the text of ISO 8879 carefully, as
- mis-use of terms defined in the Standard can lead to misunderstandings.
-
- 03.3 The following definitions are taken from ISO 8879, but readers are
- advised to consult the full text:
-
- 4.279 SGML application: Rules that apply SGML to a text
- processing application. An SGML application includes a formal
- specification of the markup constructs used in the application,
- expressed in SGML. It can also include a non-SGML definition of
- semantics, application conventions, and/or processing.
-
- 4.287 SGML system: A system that includes an SGML parser, an
- entity manager, and both or either of:
- a) an implementation of one or more SGML applications; and/or
- b) facilities for a user to implement SGML applications, with
- access to the SGML parser and entity manager.
-
- 4.285 SGML parser: A program (or portion of a program or a
- combination of programs) that recognizes markup in SGML
- documents.
-
- 4.123 entity manager: A program (or portion of a program or a
- combination of programs), such as a file system or symbol table,
- that can maintain and provide access to multiple entities.
-
- 4.120 entity: A collection of characters that can be referenced as
- a unit.
-
- NB. Note (b) of the "Scope" section of ISO 8879:1986 states
- that the Standard does NOT "Specify the implementation, architecture,
- or markup error handling of conforming systems". In the glossary
- to his book (see below) Eric Van Herwijnen defines "SGML
- implementation: A collection of SGML application procedures that....
- provide the mapping from the structure defined by a given SGML
- application to a concrete system such as a textformatter or a
- database."
-
- Note: Beginners may not find any of these definitions enlightening. Be
- aware that some posters use the terminology of ISO 8879 very rigorously,
- whilst others are more lax. This opens the door for misunderstandings.
- Unless you are sure that you are using terminology in the correct way, as
- taken from ISO 8879, please try to be as explicit and unambiguous as possible.
-
- 03.4) Put simply, most users will obtain or write a DTD (see above). If
- you write your own DTD, you will need to validate it using an SGML parser
- that conforms to ISO 8879. To create SGML documents which conform to
- a DTD you will need an editor and a parser. The editor is used to input
- information and insert SGML markup into the document; the parser is used
- to check that the markup and the way it has been used conform to the
- rules given in the DTD. Many commercial packages offer syntax-directed
- editors, which interactively ensure that any editing and markup operations
- conform to the rules of the DTD.
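-
- To give a feel for what "checking the markup against the rules given in
- the DTD" involves, here is a toy sketch in Python. It is not an SGML
- parser and does not conform to ISO 8879: the RULES table is a hypothetical
- stand-in for a DTD, documents are given as already-parsed nested tuples,
- and text is allowed inside any element.
-

```python
import re

# Hypothetical rules standing in for a DTD: for each element, a regular
# expression over the names of its child elements (each followed by a space).
RULES = {
    "paper": r"title (abstract )?(section )+",
    "title": r"",
    "abstract": r"",
    "section": r"(para )+",
    "para": r"",
}


def validate(node, errors=None):
    """Return a list of messages, one per element that breaks RULES."""
    if errors is None:
        errors = []
    name, children = node
    # Strings are text; tuples are child elements.
    elements = [c for c in children if not isinstance(c, str)]
    child_names = "".join(child[0] + " " for child in elements)
    if name not in RULES:
        errors.append("undeclared element: " + name)
    elif re.fullmatch(RULES[name], child_names) is None:
        found = child_names.strip() or "(no child elements)"
        errors.append("bad content in <%s>: %s" % (name, found))
    for child in elements:
        validate(child, errors)
    return errors


GOOD = ("paper", [
    ("title", ["On Markup"]),
    ("abstract", ["Structure matters."]),
    ("section", [("para", ["Text."])]),
])
BAD = ("paper", [
    ("abstract", ["Structure matters."]),
    ("title", ["On Markup"]),
])

if __name__ == "__main__":
    print(validate(GOOD))
    print(validate(BAD))
```

-
- A syntax-directed editor applies the same kind of check interactively, so
- that a document like BAD cannot be created in the first place.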
-
- 03.5) Once you have a valid SGML document that conforms to a valid SGML
- DTD, you may want to do some subsequent processing. For example, in
- order to get paper output, you will need a program (or set of programs),
- that can read your SGML document and produce a file acceptable to your
- word processing package/text formatter. With a well-known publicly
- available DTD, it may be possible to obtain a translation package that
- has already been written; otherwise, you will need to write any translations
- required for subsequent processing yourself.
-
- 03.6) Translating existing information into valid SGML documents can be
- more problematic. SGML is good at handling structured information. You
- will need to obtain/write a DTD which is suitable for representing the
- structure (in full or in part) of your existing information. You will then
- need to obtain/write translations which can take your existing information
- and output it in an appropriate form that includes SGML markup which can be
- validated against your chosen DTD. If your existing information already
- contains an unambiguous structure which is clearly indicated, it should
- be possible to convert this information into conforming SGML. If your
- existing information is not clearly structured, or that structure is
- ambiguous, conversion to SGML is much harder work. Complex information
- structures will also involve much more effort to translate into conforming
- SGML.
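-
- A minimal "retro-tagging" sketch in Python, assuming the easy case where
- the existing information already labels its own structure. The
- "Label: value" input format, the LABEL_TO_ELEMENT table and the <paper>
- wrapper are all hypothetical; the output would still have to be validated
- against your chosen DTD by a conforming parser, and characters such as
- "<" and "&" in the input are not handled here.
-

```python
# Hypothetical mapping from labels in the legacy text to element names.
LABEL_TO_ELEMENT = {"Title": "title", "Author": "author", "Abstract": "abstract"}


def retro_tag(text):
    """Wrap each 'Label: value' line in the element mapped to its label.

    Returns the tagged document and the numbers of the lines whose
    structure was not indicated and so need attention by hand.
    """
    tagged = []
    unclear = []
    for number, line in enumerate(text.splitlines(), start=1):
        label, sep, value = line.partition(":")
        element = LABEL_TO_ELEMENT.get(label.strip())
        if sep and element:
            tagged.append("<%s>%s</%s>" % (element, value.strip(), element))
        elif line.strip():
            unclear.append(number)
    return "<paper>\n" + "\n".join(tagged) + "\n</paper>", unclear


LEGACY = (
    "Title: On Markup\n"
    "Author: A. Writer\n"
    "Abstract: Structure matters.\n"
    "some stray line with no label"
)

if __name__ == "__main__":
    document, unclear = retro_tag(LEGACY)
    print(document)
    print("lines needing manual work:", unclear)
```

-
- The fourth line of LEGACY shows why unclear structure is costly: the
- program can only report it, and a person has to decide what it is.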
-
- 03.7) Always look out for existing DTDs/translation packages which may
- meet your needs. If you write a DTD or means of translating to/from
- SGML, consider sharing it with the rest of the SGML user community
- (post it to a newsgroup or ftp site). This is a good way for all of us
- to have access to well-written, tried and tested DTDs etc.
-
-
- 04) Books, Bibliographies, Newsletters and Journals
- ===================================================
-
- This list does not pretend to be complete, nor does it offer any value
- judgements about any of the products listed. Items in each category
- are given in alphabetical order by author/title.
-
- Some items are available at discounted rates to members of the SGML Users'
- Group or the Graphic Communications Association (GCA) - see below.
-
- Note: Robin Cover's on-line bibliography contains many more references,
- and much more information on each text.
-
-
- 04.1 Books:
-
- BRYAN, Martin "SGML: an author's guide to the standard generalized markup
- language". Wokingham/Reading/New York: Addison-Wesley. 1988. 380 pages.
- ISBN: 0-201-17535-5 (pbk).
-
- GOLDFARB, Charles "The SGML Handbook". Oxford: Oxford University Press. 1990.
- 688 pages. ISBN: 0-19-853737-9 (hbk).
-
- HERWIJNEN, Eric van "Practical SGML". Dordrecht/Boston/London: Kluwer
- Academic Publishers. 1990. 307 pages. ISBN 0-7923-0635-X (pbk).
-
- SMITH, Joan & STUTELY, Robert "SGML:the users' guide to ISO 8879". New York/
- Chichester/Brisbane/Toronto: Ellis Horwood Limited/Halstead Press. 1988.
- 173 pages. ISBN 0-7458-0221-4 (Ellis Horwood Limited) (hbk).
- ISBN 0-470-21126-1 (Halstead Press)(hbk).
-
- SOFTQUAD Inc. "The SGML Primer". Toronto: SoftQuad Inc. Private printing,
- available from SoftQuad Inc.
-
-
- 04.2 Bibliographies:
-
- COVER, Robin & DUNCAN, Nicholas & BARNARD, David "BIBLIOGRAPHY ON SGML
- (Standard Generalized Markup Language) AND RELATED ISSUES" (Technical
- Report 91-299). Ontario: Queen's University at Kingston. 1991. 312 pages.
- ISSN 0836-0227. 1991 Cost $21.00 (Canadian). Contact: Doug Hamilton,
- Dept. of Computing & Information Science. Goodwin Hall, Queen's University,
- Kingston, Ontario, CANADA K7L 3N6. Phone: (1-613) 545-6056. Email
- (Internet): hamilton@qucis.queensu.ca
-
- COVER, Robin "STANDARD GENERALIZED MARKUP LANGUAGE, ISO 8879:1986 (SGML)
- ANNOTATED BIBLIOGRAPHY AND LIST OF RESOURCES" version 2.0 Revised
- January 1992. (c) Robin Cover. Available on-line from many ftp sites.
- Updates posted to comp.text.sgml etc. Contact: Robin Cover 6634 Sarah
- Drive, Dallas, TX 75236 USA. Phone: (214) 296-1783. Fax: (214) 709-3387.
- Email (Internet): robin@utafll.uta.edu.
-
-
- 04.3 Newsletters & Journals:
-
- Note: There are several journals dedicated to CALS - see Robin Cover's
- on-line Bibliography or contact the CALS SIG of the International SGML
- Users' Group for details.
-
- "EPSIG News" - A quarterly publication of information relating to the
- ANSI/NISO manuscript standard Z39.59-1988 (also known as the "AAP"
- standard). Available through EPSIG (address below). ISSN 1042-3737.
-
- "SGML Users' Group Newsletter" - An occasional publication of news, events,
- product announcements and short articles available through the
- International SGML Users' Group (address below). ISSN 0952-8008
-
- "SGML Users' Group Bulletin" - Longer/more technical papers than appear
- in the SGML Users' Group Newsletter. Available through the International
- SGML Users' Group (address below). ISSN 0269-2538.
-
- "SGML SIGhyper Newsletter" - An occasional publication of the SGML Users'
- Group Special Interest Group on Hypertext and Multimedia (SIGhyper).
- Available through SGML SIGhyper (address below).
-
- "<TAG> The SGML Newsletter" - Managing Editor: Brian Travis (Internet email:
- brian@sgmlinc.com). 12 issues per year. Contact: Graphic
- Communications Association. Phone: +1 703-519-8157. Fax: +1 703-548-2867.
-
-
-
- 05) Newsgroups and discussion lists
- ===================================
-
- 05.1 Newsgroups
-
- comp.text.sgml - this usenet newsgroup. The main electronic forum for
- discussion of SGML and closely related matters. Begun in late 1990.
- All postings are archived at the ftp site maintained at Oslo University
- in Norway (searching via WAIS and gopher is possible).
-
- sgml@ifi.uio.no - electronic mailing list that echoes all postings to
- comp.text.sgml for those who have difficulties with usenet news. To
- subscribe:
- mail: sgml@ifi.uio.no
- subject: subscribe comp.text.sgml
- body: (blank)
-
-
- 05.2 Discussion lists
-
- sgml-l - electronic mailing list for discussion of SGML issues. Many
- articles posted to comp.text.sgml are echoed to this list. To subscribe:
- mail: listserv@dhdurz1
- subject: (blank)
- body: SUB SGML-L Michael Popham <NB. use your full name>
- SIGNUP
-
- sgml-math - electronic mailing list for discussion of issues relating to the
- handling of math under the AAP Standard. DTD fragments are circulated for
- comment. To subscribe:
- mail: listserv@e-math.ams.com
- subject: (blank)
- body: subscribe sgml-math Michael Popham <NB. use your full name>
- set sgml-math mail ack
- help
-
- sgml-tables - electronic mailing list for discussion of issues relating to the
- handling of tables under the AAP Standard. DTD fragments are circulated for
- comment. To subscribe:
- mail: listserv@e-math.ams.com
- subject: (blank)
- body: subscribe sgml-tables Michael Popham <NB. use your full name>
- set sgml-tables mail ack
- help
-
-
- tei-l - electronic mailing list for discussion/information relating to
- the work of the Text Encoding Initiative (TEI). To subscribe:
- mail: listserv@uicvm
- subject: (blank)
- body: SUB TEI-L Michael Popham <NB. use your full name>
- SIGNUP
-
-
- 06) Public Domain software
- ==========================
-
- Note: Public domain products are available from most of the anonymous
- ftp archives. (The full addresses of many of the ftp archives are given
- in Robin Cover's on-line Bibliography, or you could search available
- archives using ARCHIE). Older public domain products are also available
- from some ftp sites, but are not listed here.
-
- ARC-SGML
- A set of SGML Parser Materials, produced by Dr Charles Goldfarb
- and made available through the SGML Users' Group. Contains source code
- which can be used to build your own programs to handle SGML; also
- contains a sample application called vm2. Copies on disk are
- available through the GCA, SGML SIGhyper, and The SGML Project at the
- University of Exeter. The original source code was written in C to
- run on IBM compatible PCs under DOS. The original files and ports to many
- operating systems and platforms (e.g. UNIX, Mac) are available for ftp.
- (When searching ftp archives, look for directories/files with names like
- "arcsgml" or "ARC-SGML").
-
- ICA (Integrated Chameleon Architecture)
- A code generating software architecture for producing translators between
- different representations of electronic data. ICA is not SGML-specific.
- Runs under UNIX, using X Windows (R4, R5). The ICA Project is based at
- Ohio State University, and all new releases come from there. Available
- for ftp from archive.cis.ohio-state.edu, under the directory pub/chameleon.
- (The accompanying PostScript file of documentation runs to 186 pages).
- Contact: Peter Ware <ware@edu.ohio-state.cis>
-
- qwertz/FORMAT
- An SGML to LaTeX and nroff/troff translator produced by
- the Qwertz Project at the German National Centre for Computer Science.
- The LaTeX document styles have been re-written as an SGML DTD (the
- qwertz DTD). SGML documents can be created, and quickly mapped into
- a format suitable for processing by a LaTeX or nroff/troff formatter. New
- releases are announced on comp.text.sgml. The original code is
- available for ftp from gmdzi.gmd.de [129.26.1.90] under the directory
- /pub/gmd (get "sgml2latex-format.readme" and
- "sgml2latex-format.tar.Z").
-
- sgmls
- An SGML parser derived from the ARC-SGML Parser Materials, written
- by James Clark. sgmls outputs a simple, line-oriented, ASCII representation
- of an SGML document's Element Structure Information set which can be
- easily parsed by awk, perl, C or whatever. The idea is that sgmls can be
- used as the front end for a structure-controlled SGML application. New
- releases are announced on comp.text.sgml. sgmls consists of C source
- code intended to run under UNIX, but with instructions for porting/compiling
- under DOS. Available for ftp. (look for directories/files with names
- like "sgmls", "jclark", "sgmls-0.8.tar.Z").
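-
- As a sketch of how such line-oriented output might be consumed, the
- Python below reads a hand-written sample in which "(" starts an element,
- ")" ends one, "-" carries data and "C" marks a conforming document. The
- sample is an assumption made for illustration, not captured sgmls output;
- check the sgmls documentation for the full list of record types and for
- the escape sequences used in data lines, which are not decoded here.
-

```python
# Hand-written sample, assumed to resemble the parser's output.
SAMPLE = """\
(PAPER
(TITLE
-On Markup
)TITLE
(ABSTRACT
-Structure matters.
)ABSTRACT
)PAPER
C
"""


def text_by_element(esis):
    """Map each element name to the data lines found directly inside it."""
    found = {}
    stack = []  # names of the currently open elements, innermost last
    for line in esis.splitlines():
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "(":
            stack.append(rest)
        elif code == ")":
            stack.pop()
        elif code == "-" and stack:
            found.setdefault(stack[-1], []).append(rest)
        # Any other record type is ignored in this sketch.
    return found


if __name__ == "__main__":
    for name, chunks in text_by_element(SAMPLE).items():
        print(name, chunks)
```

-
- Because each record is one line with a one-character code, the same
- logic is just as short in awk or perl.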
-
-
- 07) Commercial software
- =======================
- This list is not complete. Omission from this list is through accident
- or ignorance. A list of products is given, followed by a list of contact
- names and addresses.
-
- Note: NO VALUE OR "FITNESS FOR PURPOSE" JUDGEMENT IS PLACED ON ANY
- PRODUCT OR SERVICE LISTED. ALWAYS CHECK WITH THE SUPPLIER TO ENSURE
- THAT ANY PRODUCT (OR COMBINATION OF PRODUCTS) WILL DO WHAT YOU WANT,
- AND WILL WORK WITH YOUR COMPUTER/OPERATING SYSTEM.
-
- Note: "quasi-WYSIWYG" refers to the capability of some packages to format
- the screen/paper output of an SGML document, such that the two output
- representations are similar.
-
-
- 07.1) Product list
-
- Agfa CAPS - for large-scale publishing. Contact: local Agfa Gevaert office.
-
- Agfa SDMS - SGML-based document management system. Contact: local Agfa
- Gevaert office.
-
- Author/Editor - create/edit/validate SGML DTDs and documents (quasi-WYSIWYG).
- Contact: SoftQuad Inc.
-
- BASISplus - SGML-aware database system. Contact: Information Dimensions Inc.
-
- DocuBuild SGML - for large-scale (CALS) publishing on DEC VAX.
- Contact: Xerox Corporation.
-
- DynaText - on-line SGML document indexing/searching/browsing. Contact:
- Electronic Book Technologies
-
- EASE - create/edit/validate SGML DTDs and documents. Contact: E2S
-
- FastTAG - conversion/auto-tagging for scanned hardcopy and electronic
- text files. Contact: Avalanche Development Company
-
- FrameMaker - DTP with some (CALS?) SGML capabilities. Contact: Frame
- Technology Corporation
-
- Grif - create/edit/validate SGML DTDs and documents (quasi-WYSIWYG), inc.
- graphics. Contact: Grif S.A.
-
- Guide - Guide hypertext from SGML documents. Contact: Office Workstations
- Limited (OWL)
-
- HyMinder - HyTime engine (in production). Contact: TechnoTeacher, Inc.
-
- Interleaf - DTP with some (CALS?) SGML capabilities. Contact: Interleaf
-
- MacroTag - supports macros for using SGML with MS-Word 4.0 or
- WordPerfect 5.0. Contact: Allen Creek Software.
-
- Mark-It - validates SGML DTDs and documents. Contact: Sema Group Systems
- Limited
-
- SGML/CALS Translator - conversion/auto-tagging. Contact: Shaffstall
- Corporation.
-
- SGML-DB - an SGML-aware database system. Contact: A.I.S/Berger Levrault
-
- SGML Hammer - conversion/auto-tagging for SGML documents.
- Contact: Avalanche Development Company
-
- SGML Publisher - create/edit/validate SGML DTDs and documents (quasi-
- WYSIWYG), inc. graphics. Contact: ArborText Inc.
-
- SGML/Search - an SGML-aware database system (being replaced by SGML-DB)
- Contact: A.I.S/Berger Levrault
-
- SGML Translator - validates SGML documents, translates SGML documents
- to DCF also BookMaster, BookManager. Contact: IBM
-
- SGML Translator - conversion/auto-tagging. Contact: Shaffstall Corporation.
-
- SGML Toolchest - a set of tools to aid the production of SGML documents on
- DEC/VAX machines. Contact: DEC
-
- Silversmith - full-text retrieval system for SGML documents. Contact:
- Taunton Engineering.
-
- TABLETAG - conversion/auto-tagging for Lotus 1-2-3 spreadsheets to
- CALS, AAP and Author/Editor tables. Contact: The Unifilt Company.
-
- TagWorX - conversion/auto-tagging of scanned documents (within ScanWorX).
- Contact: Xerox Imaging Systems.
-
- TextWrite - create/edit SGML documents. Contact: IBM
-
- TextWrite Tools - create/edit SGML DTDs. Contact: IBM
-
- WordPerfect Markup - converts between (UNIX) WordPerfect 5.1 and SGML
- documents. (In testing until late 1992; release due early 1993?).
- Contact: WordPerfect
-
- Write-It - create/edit SGML documents. Contact: Sema Group Systems Limited
-
- WriterStation - create/edit SGML documents. Contact: Datalogics Inc.
-
- XGML Omnimark - conversion/auto-tagging. Contact: Exoterica Corporation
-
- XGML Validator - validates SGML DTDs and documents. Contact: Exoterica
- Corporation
-
-
- 07.2 Contact list
-
- A.I.S/Berger Levrault - 34 Av. du Roule, 92200 Neuilly, France;
- Phone: +33-1-46-40-10-60; Fax: +33-1-46-40-18-44.
-
- ArborText Inc. - 533 West William Street, Suite 300, Ann Arbor, MI 48103,
- USA; Phone: +1-313-996-3566; Fax: +1-313-996-3573;
- Email: sales@arbortext.com (Internet)
-
- Agfa Gevaert - (Your local Agfa Gevaert office/supplier).
-
- Allen Creek Software - Carol Kamm, 1209 West Huron, Ann Arbor, MI 48103,
- USA; Phone: +1-313-663-4248.
-
- Avalanche Development Company - Eileen Quirk, Director of Marketing and Sales,
- Avalanche Development Company, 947 Walnut Street, Boulder, CO 80302,
- USA; Phone:+1-303-449-5032; Fax: +1-303-449-3246;
- Email: sales@avalanche.com (Internet)
-
- Datalogics Inc. - 441 West Huron Street Chicago, Illinois 60610, USA;
- Phone: +1-312-266-4444.
-
- DEC - (Your local Digital Equipment Corporation (DEC) supplier)
-
- E2S - Ronny Verkest, Sales Manager, E2S, Moutstraat 100, B-9000 Gent,
- Belgium; Phone: +32(91)-21-03-83; Fax: +32(91)-20-31-91;
- Email: e2s@e2s.be (Internet)
-
- Electronic Book Technologies - One Richmond Square, Providence,
- RI 02906, USA; Phone: +1-401-421-9550; Fax: +1-401-421-9551
-
- Exoterica Corporation - 1545 Carling Ave., Suite 404, Ottawa, Ontario,
- Canada K1Z 8P9; Tel: 613-722-1700 (or 1-800-565-XGML
- for product information); Fax: 613-722-5706.
- Email: info@xgml.com (enquiries) (Internet)
-
- Frame Technology Corporation - 2911 Zanker Road, San Jose, CA 95134,
- USA; Phone: +1-408-433-3311
-
- Grif S.A. - 2 Bd Vauban BP 266, 78053 St-Quentin-en-Yvelines, Cedex,
- France; Phone: +33-1-30-60-75-10; Fax: +33-1-30-60-75-27
-
- IBM - (Your local IBM office)
-
- Information Dimensions Inc. - 5080 Tuttle Crossing Boulevard, Dublin,
- Ohio 43017-3569, USA; Phone: 1-800-DATA-MGT; Fax: 614-761-7290
-
- Interleaf - (Your nearest Interleaf supplier)
-
- Office Workstations Limited (OWL) - Rosebank House, 144 Broughton Road,
- Edinburgh EH7 4LE, UK; Phone: +44-31-557-5720; Fax: +44-31-557-5721.
-
- Sema Group Systems Limited - Martin Bryan, Sema Group Systems Limited,
- Avonbridge House, Bath Road, Chippenham, Wiltshire SN15 2BB, UK;
- Phone: +44-249-656194; Fax: +44-249-655723
-
- Shaffstall Corporation - Anthony L. Shaffstall, VP Sales, 7901 East 88th
- Street, Indianapolis, IN 46256-1235, USA; Phone: +1-317-842-2077
-
- SoftQuad Inc. - 56 Aberfoyle Crescent, Suite 810, Toronto, CANADA, M8X 2W4;
- Phone: +1-416-239-4801; Fax: +1-416-239-7105; Email: mail@sq.com
- (Internet)
-
- Taunton Engineering - John Bottoms, Taunton Engineering Inc., 26 Westvale Road,
- Concord, MA 01742-2935, USA.
-
- TechnoTeacher, Inc.- Steve Newcomb, TechnoTeacher, Inc., 1810 High Road,
- Tallahassee, FL 32303-4408, USA; Phone: +1-904-422-3574;
- Fax: +1-904-386-2562.
-
- The Unifilt Company - Michael Kless (President), PO Box 2528, Edison,
- NJ 08817, USA. Phone: +1-908-225-2243; Fax: +1-908-225-2248.
-
- WordPerfect - (Your local/national WordPerfect Corporation office)
-
- Xerox Corporation - Publishing Marketing Manager, 10200 Willow Creek
- Road, San Diego, CA 92131, USA. Phone: +1-619-695-7789;
- Fax: +1-619-695-7710.
-
- Xerox Imaging Systems - 9 Centennial Drive, Peabody, MA 01960, USA.
- Phone: +1-508-977-2000; Fax: +1-508-977-2148.
- (European office): Unit 8, Suttons Business Park, Reading,
- RG6 1AZ, UK. Phone:+44-734-668421; Fax: +44-734-261913.
-
-
-
-
- 08) ftp archives
- ================
-
- Many ftp archives now hold information on SGML and copies of public
- domain software, DTDs etc. (see Robin Cover's on-line Bibliography, or
- use ARCHIE to find your nearest ftp site). The main archives are:
-
- ftp.ifi.uio.no [129.240.88.1] - University of Oslo, Norway
-
- mailer.cc.fsu.edu [128.186.6.103] - Florida State University, USA
-
- sgml1.ex.ac.uk [144.173.6.61] - Exeter University, UK
-
- The ftp archive at Oslo is very well maintained. Some national archives
- mirror the contents of one or more of those mentioned above.
-
- Please try to be considerate when using ftp - it is a privilege not a right.
- The setting up and maintenance of most archives is done on a voluntary
- basis, using resources that are loaned by the site administrators.
-
-
- 09) The SGML Users' Group, National Chapters, SIGs, Standards bodies
- ====================================================================
-
- The International SGML Users' Group was set up to promote the use of
- SGML and represent the interests of SGML users on various international
- bodies.
-
- Membership of the International SGML Users' Group (or an affiliated
- National Chapter or SIG), entitles you to receive the Users' Group Newsletter
- and Bulletin, and discounts on various books and conferences.
-
- For membership details, contact:
-
- Mr Stephen G Downie
- SoftQuad Inc.
- 56 Aberfoyle Crescent
- Suite 810
- Toronto, Ontario M8X 2W4
- Canada
-
- Phone: +1-416-239-4801
- Fax: +1-416-239-7105
-
- Activities and costs of joining a National Chapter or SIG vary greatly.
- Please contact the appropriate person (see below) for more details. Other
- CALS SIGs may exist, but I do not have any information about them; I will
- list them if and when details are supplied.
-
- 09.1 National Chapters are listed below by country (with contact names):
-
- Australia - (Just setting up. Contact: Nick Carr, PO Box R806, Sydney NSW,
- Australia 2000; Phone: 612-262-4777; Fax 612-262-4774)
-
- Canada - Dr Martin Levy (Chairman), Senior Director Regulatory Affairs,
- Vice President Scientific Affairs, Fujisawa Pharmaceutical
- Company, 7181 Woodbine Avenue, Suite 110, Markham, Ontario L3R 1A3
- Canada; Phone: +1-416-470-7990; Fax: +1-416-470-7799.
-
- France - (Just setting up. Watch this space!)
-
- Germany - Dr Manfred Kruger, MID/Information Logistics Group GmbH,
- Ringstrasse 15, 6900 Heidelberg, West Germany; Phone: +49-6221-
- 166-091; Fax: +49-6221-23921.
-
- Japan - Mr Makoto Yoshioka, Research Fellow, Personal Systems Division,
- Fujitsu Laboratories Ltd, 1015, Kamikodanaka Nakahara-Ku,
- Kawasaki 211, Japan; Phone: +81-44-754-2690; Fax:+81-44-754-2594.
-
- Netherlands - Mr Jan Maasdam, Samsom Uitgeverij, Postbus 4, 2400 MA
- Alphen aan den Rijn, The Netherlands; Phone: +31-1720-66-612.
-
- New Zealand - (See Australia)
-
- Norway - Mr Jon Urdal, Fabritius A/S, Brobekkvn. 80, 0583 Oslo 5, Norway.
- Phone: +47-2-636400; Fax: +47-2-636590; Email: ju@gi.no (Internet)
-
- South East Asia - (See Australia)
-
- Switzerland - Mr Jurgen De Jonghe, AS Division, CERN, Geneva 23,
- Switzerland; Phone: +41-22-767-81-41; Fax: +41-22-782-47-20.
-
- UK - Mr Nigel Bray, Database Publishing Systems Ltd., 608 Delta Business
- Park, Great Western Way, Swindon, Wiltshire SN5 7XF, UK;
- Phone:+44-793-512-515; Fax: +44-793-512-516.
-
- USA (Colorado) - (Just setting up. Contact Brian Travis, Editor of <TAG>
- who may have more information)
-
- USA (New York) - Mr W Joseph Davidson, SGML Forum of New York, Bowling
- Green Station, P.O. Box 803, New York, NY 10274-0803, USA;
- Phone: +1-212-691-4463; Fax: +1-212-691-1821.
-
- USA (Washington/Midwest) - Ms Beth Micksch, Datalogics Inc., 441 West Huron
- Chicago, IL 60611, USA; Phone: +1-312-266-3131; Fax: +1-312
- -266-4473; Email: bem@dlogics.com (Internet)
-
-
- 09.2 Special Interest Groups (SIGs):
-
- ATA SIG - Ms Dianne Kennedy, Datalogics Inc., 441 West Huron, Chicago,
- IL 60611, USA; Phone: +1-312-266-4483; Fax: +1-312-266-4473;
- Email: dkv@dlogics.com (Internet)
-
- CALS in Europe SIG - David Ardron, Secretary, CALS in Europe SIG, Ferranti
- Computer Systems Ltd., Western Road, Bracknell, Berkshire RG12 1RA,
- UK; Phone: +44-344-483232; Fax: +44-344-54639
-
- Database SIG - Mr Hans Mabelis, c/o Matrices Software, Westeinde 14,
- 1017 ZP Amsterdam, The Netherlands; Phone: +31-20-25-50-06;
- Fax: +31-20-24-79-48.
-
- European Workgroup on SGML (EWS) - Mr Holger Wendt, Springer-Verlag GmbH & Co.
- KG, Postfach 105280, Tiergartenstrasse 17 6900, Heidelberg 1,
- Germany; Phone: +49-6221-487-324; Fax: +49-6221-43982.
-
- SGML SIGhyper (The SGML Users' Group SIG on Hypertext and Multimedia) -
- Mr Steven R Newcomb, TechnoTeacher, Inc., 1810 High Road,
- Tallahassee, FL 32303-4408, USA; Phone: +1-904-422-3574;
- Fax: +1-904-386-2562.
-
-
-
- 09.3 Standards bodies:
-
- American National Standards Institute (ANSI) - 1430 Broadway, New York,
- NY 10018, USA. Phone: +1-212-642-4995.
-
- British Standards Institution (BSI) - Linford Wood, Milton Keynes, MK14, UK.
-
- International Organization for Standardization (ISO) - 1 Rue de Varembe,
- Case Postale 56, CH-1211 Geneva 20, Switzerland.
-
- 10) Conferences
- ===============
-
- The major SGML-related conferences are organized by the Graphic
- Communications Association (GCA), at reduced rates for GCA members. Anyone
- can attend, and for SGML '92 the GCA have introduced a discount rate for
- representatives from academic institutions. To contact the GCA:
-
- Graphic Communications Association
- 100 Daingerfield Road, 4th Fl.
- Alexandria
- VA 22314-2888
- Phone: (703)-519-8157
- Fax: (703)-548-2867
-
-
- "International Markup" - Held annually since 1982 in Europe. General/all
- SGML (and closely related issues). Next: 9-12 May 1993, Rotterdam.
-
- "SGML" - Held annually in USA. General/all/specialist SGML (and closely
- related issues). Generally more technical than "International Markup".
- Next: 25-30 October 1992, Danvers (MA).
-
-
-
- 11) SGML Initiatives and major projects
- =======================================
-
- This section contains details on:
-
- AAP / EPSIG
- EWS
- TEI
-
- 11.1) AAP / EPSIG - Association of American Publishers, Electronic
- Publishing Special Interest Group. Responsible for producing, maintaining
- and updating ANSI/NISO Z39.59-1988 (also known as the "AAP Standard"). The
- AAP standard consists of a set of DTDs relating to the preparation and
- markup of electronic manuscripts. The standard is currently undergoing
- review, and specialist workgroups of interested volunteers are working on
- DTDs for handling tables and mathematics (using two electronic discussion
- lists - see the appropriate section, above). For information, contact:
-
- Betsy Kiser (EPSIG Manager)
- EPSIG
- c/o OCLC
- 6565 Frantz Road
- Dublin
- Ohio 43017-0702
- USA
-
- Phone: (614)-764-6195
- Fax: (614)-764-6096
-
-
- 11.2) EWS - European Workgroup on SGML. A collection of major European
- publishers, typesetters, printers (and other interested parties), working
- toward producing a DTD (or set of DTDs), suitable for publishing scientific
- journals, papers etc. Initially based on, and still closely watching, work
- on the AAP Standard. EWS DTD(s) currently known as the "MAJOUR DTD" - a
- complete version of which is due out at the end of 1992 (a copy of the
- header part of the DTD and accompanying handbook was distributed at
- "International Markup '91"). For information, contact:
-
- (See entry for EWS above - in section dealing with
- Special Interest Groups).
-
-
- 11.3) TEI - The Text Encoding Initiative. An international research project
- to develop and disseminate guidelines for the encoding and interchange of
- machine-readable texts. Primarily concerned with taking existing texts,
- and marking them up with SGML (so as to facilitate later study). The TEI
- has several specialist committees and workgroups e.g. General Linguistics,
- Spoken Texts, Historical Studies, Machine Readable Dictionaries, Computational
- Lexica, Terminological Databases, Character Sets, Text Criticism, Hypertext
- and Hypermedia, Mathematical Formulae and Tables, Language Corpora, Verse,
- Performance Texts, Literary Prose. The TEI maintains two electronic
- discussion lists (tei-l and sgml-l), two archives of related documentation
- (reports, DTDs, and entity sets), and publishes widely. For more information,
- contact one of the co-ordinators:
-
- C. Michael Sperberg-McQueen
- University of Illinois at Chicago
- Computer Center (M/C 135)
- Box 6998
- Chicago IL 60680
- USA
-
- Phone: +1-312-996-2981
- Fax: +1-312-996-6834
- Email: U35395@uicvm.cc.uic.edu (Internet)
- U35395@UICVM (Bitnet)
-
- Lou Burnard
- Oxford University Computing Service
- 13 Banbury Road
- Oxford OX2 6NN
- UK
-
- Phone: +44-865-273-238
- Fax: +44-865-273-275
- Email: LOU@VAX.OX.AC.UK (Internet)
-
-
-
-
- 12) SGML and other Standards
- ============================
-
- Some or all of the Standards mentioned below are described in much more
- depth in Robin Cover's on-line Bibliography (users are advised to consult
- this). Copies of International Standards are available through your national
- standards body. Standards are listed below by acronym/title.
-
- ASCII - (see Character Sets)
-
- Character Sets -
- Being non-proprietary and device-independent, SGML does not restrict users
- to a particular character set. This is a complex area of SGML, and readers
- are directed to ISO 8879, and the ftp archive of postings to comp.text.sgml
- on this subject (at Oslo) for further information.
-
- DSSSL - Document Style Semantics and Specification Language (ISO/IEC DIS
- 10179:1990).
- The introduction to the Standard says DSSSL can be used "..for the
- specification of document processing such as formatting and data management
- functions, with the initial focus on formatting to both print and on display
- media, and data conversion......The objective of the DSSSL Standard is to
- provide a formal and rigorous means of expressing the range of document
- production specifications, including high-quality typography, required
- by the graphics arts industry." DSSSL is not yet a full International
- Standard.
-
- EBCDIC - (see Character Sets)
-
- Graphics - (see Proprietary/defacto standards)
-
- HyTime - Hypermedia/Time-based Structuring Language (ISO/IEC 10744)
- (Extract from Robin Cover's Bibliography:) "HyTime is a standard neutral
- markup language for representing hypertext, multimedia, hypermedia and
- time- and space-based documents in terms of their logical structure. Its
- purpose is to make hyperdocuments interoperable and maintainable over the
- long term. HyTime can be used to represent documents containing any
- combination of digital notations. HyTime is parsable as Standard Generalized
- Markup Language..." HyTime was accepted as a full International Standard
- in spring 1992.
-
- ODA - Open (or "Office") Document Architecture (ISO 8613)
- This is mentioned because ODA is often presented as if both it and SGML
- existed in opposition to one another. This is not the case. ODA is a
- complex standard, and is undergoing a thorough review; contact your national
- standards body for more information. Papers have been published that
- compare and contrast ODA and SGML (see Robin Cover's on-line Bibliography
- for references). Note: some readers object to postings on ODA being
- sent to comp.text.sgml (try comp.text instead).
-
- Proprietary/defacto standards.
- There is a common misconception that SGML exists in competition with some
- major existing defacto and proprietary standards (such as PostScript
- or TeX/LaTeX). This is not the case, and as you find out more about SGML
- this should become self-evident (however, see SPDL). Note that SGML
- enables the inclusion of data marked up with something other than SGML
- (e.g. TeX, Encapsulated PostScript, a Lotus spreadsheet, CCITT/4,
- CGM, TIFF etc.)
-
- SDIF - SGML Document Interchange Format (ISO 9069:1988)
- "A standard describing the interchange for documents enclosed with SGML"
- (Eric Van Herwijnen, "Practical SGML")
-
- SGML-B - (SGML Binary ?)
- A standard for describing a compiled form of SGML (?). James David Mason
- (Convenor, ISO/IEC JTC1/SC18/WG8), posted the following message to
- comp.text.sgml (15 Aug 1992) "The official status of SGML-B is that it
- is an approved work item in ISO/IEC JTC1/SC18/WG8, the group responsible
- for SGML itself. The editors are Dr. Charles Goldfarb, the SGML project
- leader, and Dr. David Abrahamson, of Trinity College, Dublin. The project
- is being maintained as officially active, with the provision that it will
- not be progressed until the current review and potential revision of SGML
- itself is further along. Our intention is to make SGML-B reflect whatever
- revisions we decide to incorporate into the base standard and then to
- make it a part of the revised standard rather than something independent."
-
- SMDL - Standard Music Description Language (ISO/IEC CD 10743)
- (Extract from Robin Cover's Bibliography:) "..SMDL 'defines a
- language for the representation of music information, either alone,
- or in conjunction with text, graphics, or other information needed for
- publishing or business purposes.' Multimedia time sequence information
- is supported. SMDL is a HyTime application...." SMDL came before, and
- was the source of inspiration for HyTime. Not to be confused with
- SDML ("Standard Digital Markup Language"?) which is a proprietary standard.
-
- SPDL - Standard Page Description Language (ISO/IEC DIS 10190:1991)
- A Standard for mapping to (and possibly from) a description language for
- output devices. Thus an SGML document might go through DSSSL- and SPDL-
- conforming processes before being output on a printer. SPDL might be
- seen to be competing with defacto standards such as PostScript.
-
-
- 13) Introductory questions with answers - by Erik Naggum
- ========================================================
-
- Note: This section includes the bulk of the text posted by Erik Naggum
- to comp.text.sgml as FAQ version 0.0 (Dec. 15 1992). It contains the
- following sections, questions, and answers:
-
- <SECTION>INTRODUCTORY QUESTIONS with answers.
- <Q>What is SGML, briefly?
- <Q>Can I read more about SGML somewhere?
- <Q>SGML is often mentioned as being a "meta-language". What is that?
- <Q>What does an SGML document look like?
- <Q>What do you mean "my parser"? Are there any freely available ones?
- <Q>Can I get the ARC SGML from somewhere electronically?
- <Q>I've received an SGML document from a net.friend, what can I do with it?
- <Q>I'm writing a book, and my publisher wants me to submit an SGML
- document on a diskette, what do I do?
-
- <SECTION>TECHNICAL QUESTIONS with answers
- <Q>What, precisely, is an "element"?
- <Q>What is an "entity" in SGML?
-
- <SECTION>INTRODUCTORY QUESTIONS with answers.
-
- <Q>What is SGML, briefly?
-
- <A>SGML is an abbreviation for the "Standard Generalized Markup
- Language". SGML is defined in an International Standard published by
- the International Organization for Standardization (ISO), with
- reference number ISO 8879:1986, bearing the full name "Information
- processing -- Text and office systems -- Standard Generalized Markup
- Language (SGML)".
-
- To most people, _markup_ means an increase in the price of an article.
- Although we talk about increases in value, it's not the same thing.
- "Markup" is a term coming from the publishing and printing business,
- where it means the instructions for the typesetter that were written
- on a typescript or manuscript copy by an editor. Today, with your
- favorite editor, you can enter the markup yourself, or even have it
- entered for you, in terms of codes or other instructions for an
- electronic typesetting program, which in simple cases is also the
- editor. An example is troff's ".ce" for "center the following line".
-
- A _markup_language_ is a set of means (constructs) to express how text
- (i.e., that which is not markup) should be processed, or handled in
- other ways. Unlike most other artificial languages, markup languages
- have to deal with embedded data, and contain rules for what is markup
- and what is data. For instance, in TeX the backslash means that
- subsequent input is TeX instructions. Most markup languages offer
- additional, administrative, language constructs, with which to define
- other language constructs (such as macros).
-
- _Generalized_markup_ is markup that has the curious property that it
- does _not_ specify how things should look. We still call it markup,
- though, because of the similarity with markup as described above. For
- instance, "<Q>" and "<A>" are used in this FAQ to denote Question and
- Answer, respectively. This doesn't say anything about how questions
- should look in a typeset edition of this FAQ. You could have all the
- questions rendered in bold-face, for instance. With generalized
- markup, you tell the system _what_ you have, rather than how it should
- look, and you do so by putting a label (tag) around the text. There
- is a clear correlation between tags and what things look like. Tags
- are placed at the start and at the end of text of a certain kind, and
- these are precisely the places where typographic features are used,
- such as spacing, change of typeface, etc. An example is LaTeX, which,
- through macros, lets you talk about itemized lists instead of indents
- and item numbering, among other things.
-
- The _Standard_Generalized_Markup_Language_ started out as a large set
- of common tags, but it was soon discovered that this would be far too
- large and still not big enough. So rather than try to outdo Sisyphus
- in pointless and eternal tasks, SGML is a language which makes it
- possible to roll your own generalized markup language, but with a
- standard form and in standard ways. (In practice, you won't exactly
- roll your own, any more than you design LaTeX packages on your own.
- Although some people actually do that!) Central to the design of SGML
- is the idea that a set of generic identifiers (the names of the tags),
- together with their interrelationships, form a type (or class) of
- documents, and that every document is an instance of a class, which
- means it can be validated with respect to this class.
-
- <Q>Can I read more about SGML somewhere?
-
- <A>Let me suggest only one book, and then a bibliography. The book is
- Charles F. Goldfarb: The SGML Handbook; Oxford University Press, 1990;
- ISBN 0-19-853737-9. This book includes the text of the standard, so
- you don't have to worry about finding out how to order it from your
- ISO national member body or directly from ISO in Geneva, or wherever.
- The main feature of this book is that Charles Goldfarb, who is the
- project editor for the standard in ISO's SGML committee, has added a
- tremendous amount of annotations and has provided links between parts
- of the standard to guide your yearning for knowledge. Another big win
- is the overview, which takes you through a guided tour of concepts and
- facilities. If there be only one authority on SGML, this book is it.
- A "paper hypertext" feature makes the links in the text easy to
- follow. This is a book you need.
-
- The bibliography is Robin Cover's Brief Bibliography, also to be
- published on this newsgroup, and it covers the essentials, as well as
- enough pointers to other works to fill a wall of literature. Robin
- Cover, et alia, produced the huge, 312-page "Bibliography on SGML"
- (Tech Report 91-299, Queen's University, Kingston, Ontario, Canada),
- an incredibly useful work. Robin Cover continues to track the SGML
- arena, and hopefully, he will continue to provide us with the fruits
- of his work.
-
- <Q>SGML is often mentioned as being a "meta-language". What is that?
-
- <A>This refers to the fact that SGML isn't only one language, but a
- language which describes other languages within its framework. As we
- talked about classes of documents and every document being an instance
- of such a class, we talk about a class of markup languages, and every
- markup language being an instance of the class. SGML also has the
- necessary expressive power to redefine the particular characters that
- are to be considered markup in a particular markup language, so that
- SGML is really a meta-language with an abstract syntax that each SGML
- document fills in to get a concrete syntax and a particular markup
- language for that document. This is the administrative information
- that makes it possible to talk about "conformance" to SGML.
-
- <Q>What does an SGML document look like?
-
- <A>An SGML document is divided into three different parts, each with a
- clearly defined function.
-
- The first part specifies the character set of the document, which of
- these characters have special meaning to SGML in the rest of the
- document, and which advanced features are used. This is called the
- "SGML declaration", and is like a list of ingredients on food, so you
- know what to expect and what you can't eat. Using this as a check-
- list, you can determine whether your system can handle the document at
- hand. The SGML declaration begins like this:
-
- <!SGML "ISO 8879:1986"
-
- (There are cases where this might be absent. If the document uses all
- the default features, and a concrete syntax defined in the standard as
- the Reference Concrete Syntax, then nothing needs to be said. This is
- chiefly useful within a local system.)
-
- All this talk about an abstract syntax can be a little overwhelming,
- so we'll use the Reference Concrete Syntax unless something is said to
- the contrary.
-
- The second part of an SGML document specifies the document type and
- thereby the tags that can be used in the document, among a host of
- other things. Most often, SGML documents have this part external to
- the document itself, so it doesn't look big. Most users won't see the
- many markup declarations, as they're called, that go into a document
- type, so I'll leave it to the technical part of this FAQ. Anyway,
- this part consists of document type declarations, and they start thus:
-
- <!DOCTYPE
-
- followed by the name of the document type, and its definition. A
- simple definition of this FAQ could point to a file "faq", and then
- the document type declaration would look like this:
-
- <!DOCTYPE faq SYSTEM "faq">
-
- (There can be several document types, and another construct called
- link type declarations (similar to DOCTYPE, but with LINKTYPE).)
-
- The third part of an SGML document is the marked-up "real" document
- which all of the administrative information and legwork makes
- possible. This is called the document instance. It usually begins
- with the name of the document type in angle brackets, like this:
-
- <faq>
-
- which is the syntax for a start-tag of an element. The corresponding
- end-tag looks like this:
-
- </faq>
-
- When your parser reads your document, it checks that the tags in the
- document belong to the document type, and that they are allowed where
- they're used, again according to the document type. This process is
- called "validation". When a document is validated, it does not need
- to be so again no matter what your parser is instructed to do with it,
- and no matter which application will use the data in the document.
- This is another strength of SGML: application-independent validation.
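-
- As a minimal sketch of how the three parts fit together, the document
- below omits the SGML declaration (so the defaults apply), gives its
- document type definition inline instead of pointing to an external
- file, and ends with a short document instance. The element names
- section, title, q and a are assumptions made up for this illustration;
- they are not an actual document type for this FAQ.
-

```sgml
<!DOCTYPE faq [
<!-- Hypothetical document type for a FAQ; the names are illustrative -->
<!ELEMENT faq     - - (section+)>
<!ELEMENT section - O (title, (q, a)+)>
<!ELEMENT title   - O (#PCDATA)>
<!ELEMENT q       - O (#PCDATA)>
<!ELEMENT a       - O (#PCDATA)>
]>
<faq>
<section><title>Introductory questions
<q>What is SGML, briefly?
<a>SGML is an abbreviation for the "Standard Generalized Markup Language".
</faq>
```

-
- The end-tags of section, title, q and a are left out here, which the
- "O" in their declarations permits, assuming the parser has the OMITTAG
- feature enabled. A validating parser should accept this instance, but
- report an error if, say, an <a> appeared before any <q>, because the
- document type does not allow it there.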
-
- <Q>What do you mean "my parser"? Are there any freely available ones?
-
- <A>99% of the fun with SGML can be had only with a parser, so you do
- need one. (The remaining 1% comes from beholding the elegance and
- beauty of the language, and contemplating all the wondrous things you
- can do with it, once you have a parser. This feeling tends not to
- last, unless you're developing a parser, in which case it's almost all
- the fun.) Fortunately, a competent programmer and SGML aficionado
- has had a lot of fun lately, and in mid-July 1991, the ARC SGML parser
- materials were released. The ARC SGML parser materials are legally
- unencumbered (i.e., you can do whatever you want with them) and they're
- available for a nominal cost from the SGML Users' Group, as well as
- from several public SGML repositories.
-
- <Q>Can I get the ARC SGML from somewhere electronically?
-
- <A>The University of Oslo, Department of Informatics, kindly sponsors
- a public FTP archive with material on SGML and has the ARC SGML parser
- available for anonymous FTP. Both the original MS-DOS distribution
- and a Unix port done by James Clark are available. This archive also
- holds information on some standards related to SGML, most notably an
- SGML application for hypermedia documents (the Hypermedia/Time-based
- structuring language, HyTime). Take a look around in the SGML and
- SIGhyper subdirectories. (Anonymous FTP works like this: You need to
- be connected to the Internet, and need a program which can talk the
- FTP protocol, usually something with "FTP" in it. On Unix systems,
- you can say "ftp ftp.ifi.uio.no", and that should be it. You will be
- asked for a user name -- reply "anonymous". You will then be asked
- for a password -- reply with your Internet mail address. You're now
- logged in, and can use the "cd" command to switch directories ("cdup"
- to go one level up), and "ls" to look around. Use "get" to fetch
- files.) If you need guidance, or can't use FTP, you may write to
- <SIGhyper-request@ifi.uio.no>, which I'll try to answer as fast as
- possible. There are also other FAQs available on how to FTP.
-
- <Q>I've received an SGML document from a net.friend, what can I do
- with it?
-
- <A>Didn't your net.friend tell you?? Seriously, an SGML document is,
- as mentioned above, an instance of a document type, and a document
- type can be many things, and it's only part of an application of SGML.
- Such an application consists of several parts: First, there's the
- document type definition, which says which elements you can have, and
- how they interrelate. Second, with the document type definition,
- there's a description of the semantics of the elements, so you know
- what they mean. The description is needed because SGML is not
- concerned with what things mean, only how they are represented. (You
- might complain that this is too small, but it's better to do a given
- task well than to do a greater task badly. There are other standards
- in the great SGML family which take care of these things, and more are
- coming as we witness increased adoption of SGML in the market.)
-
- <Q>I'm writing a book, and my publisher wants me to submit an SGML
- document on a diskette, what do I do?
-
- <A>You take a look at one of the several SGML editing systems around,
- and see which you think you would like to write a whole book with.
- Recruit your publisher to help you understand what he wants, and try
- to play with SGML a little before you start writing. SGML is like,
- um, anyway, it gets better with experience, and can be frightening the
- first time. For a good list of starter tools, I again refer you to
- Robin Cover's brief bibliography for the details.
-
-
- <SECTION>TECHNICAL QUESTIONS with answers
-
- <Q>What, precisely, is an "element"?
-
- <A>An element is the smallest part of a document that SGML deals with,
- and it's the basic building block of document types. An element may
- contain data (text), subelements, both, or it may be empty. The task
- of a document type designer is to identify the elements a document is
- to consist of, and define a hierarchical structure of these elements
- by means of other elements. An element definition consists of the
- name (generic identifier) which will be used in tags, a description of
- the content (using a "content model"), and an indication of whether
- the start-tags or the end-tags may be omitted. An element (in the
- document instance) is indicated by a start-tag, the contents, and an
- end-tag.
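-
- A hedged sketch of what such element definitions can look like in a
- document type definition. The names answer, para and figure are
- hypothetical. After the name come two characters, one for the
- start-tag and one for the end-tag ("-" means the tag is required, "O"
- means it may be omitted), followed by the content model or a keyword
- such as EMPTY.
-

```sgml
<!-- answer: both tags required; contains paragraphs and figures -->
<!ELEMENT answer - - (para | figure)+>
<!-- para: the end-tag may be omitted; contains character data only -->
<!ELEMENT para   - O (#PCDATA)>
<!-- figure: an empty element, so it has no content and no end-tag -->
<!ELEMENT figure - O EMPTY>
```

-
- With these declarations, and assuming tag omission is enabled, an
- instance such as <answer><para>First.<para>Second.<figure></answer>
- contains two para elements and one figure element inside one answer.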
-
- An element, with its notion of content models, provides a powerful
- abstraction over the different kinds of text that can be found in a
- document. For instance, ordinary text is just characters that will be
- formatted somehow on output. If you have special kinds of text, such
- as, for instance, a telephone number, it could make sense (depending
- on your application) to make a special element with generic identifier
- "phone". That way, you can look for telephone numbers and get matches
- only at the right places. If you're really far-sighted, you would
- define a telephone number notation you associate with this element,
- so that you could check that all your phone numbers had the right
- format. Then you could modify the presentation of a phone number to
- suit a particular need, e.g. <phone>+1 516 555 8879</phone> in the
- document could come out as "(516) 555 8879" in a domestic catalog and
- with full, international format for an international catalog.
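-
- One way this could be declared, as a sketch only: the notation name
- intlphone, its system identifier, and the attribute name format are
- all assumptions chosen for illustration.
-

```sgml
<!-- Hypothetical notation naming the "+country area number" format -->
<!NOTATION intlphone SYSTEM "intlphone">
<!ELEMENT phone - - (#PCDATA)>
<!-- The format attribute ties the content of phone to that notation -->
<!ATTLIST phone format NOTATION (intlphone) intlphone>
```

-
- SGML itself only records the association between the element and the
- notation. Checking that the content really has the right format, and
- rewriting it for a domestic or an international catalog, is left to
- the application that processes the document.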
-
- In a way, elements are like concepts, where a concept (say, "beef") is
- an abstraction over an innumerable lot of things into a particular
- "type" of thing, all having common characteristics, and fits into a
- hierarchy where concepts may be abstractions over other concepts.
- This idea of "types" and of a conceptual tool for text is one of the
- many great things with SGML. A content model is like the definition
- of a concept, with the important difference that a content model is
- defined in terms of the behavior of its subelements. A subelement may be
- optional, required, or repeatable, and subelements may be chosen from
- a set, form an ordered set, or form an unordered set. Then there are
- exceptional subelements, which may either be forbidden or allowed
- anywhere in the contents of the element.
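-
- The possibilities just listed map onto content model syntax roughly
- as follows. This is an illustrative fragment with made-up element
- names, and the declarations for title, note, emph, phone, author and
- date are not shown. The occurrence indicators are "?" (optional), "+"
- (required and repeatable) and "*" (optional and repeatable); the
- connectors are "," (ordered), "|" (choice) and "&" (unordered).
-

```sgml
<!-- Ordered: a title, then one or more paragraphs, then an optional note -->
<!ELEMENT section  - - (title, para+, note?)>
<!-- Choice, repeatable: any mix of text, emphasis and phone numbers -->
<!ELEMENT para     - O (#PCDATA | emph | phone)*>
<!-- Unordered: author and date must both appear, in either order -->
<!ELEMENT byline   - O (author & date)>
<!-- Exceptions: footnotes are allowed anywhere inside a chapter (an
     inclusion), but never inside another footnote (an exclusion) -->
<!ELEMENT chapter  - - (title, section+) +(footnote)>
<!ELEMENT footnote - - (para+) -(footnote)>
```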
-
- The similarity between elements and concepts goes further, as elements
- may have attributes. An attribute is information about an element
- which is not part of its content.
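-
- A small sketch of an attribute definition list, assuming a
- hypothetical q element with two made-up attributes, id and status:
-

```sgml
<!ELEMENT q - O (#PCDATA)>
<!-- id:     optional identifier, unique within the document
     status: either draft or final; final is assumed if not given -->
<!ATTLIST q
          id      ID               #IMPLIED
          status  (draft | final)  final>
```

-
- In the document instance the attributes go inside the start-tag, for
- example <q id="q1" status="draft">What is SGML, briefly?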
-
- The element in SGML is thus a high abstraction over identifiable,
- separate portions of contents of a document from a conceptual and
- hierarchical view.
-
- <Q>What is an "entity" in SGML?
-
- <A>The notion of an entity in SGML is an even higher abstraction than
- the element, and since this is somewhat unexpected to most readers of
- SGML, it's probably the reason why so many have problems with it.
-
- The concept of an element comes from looking at the contents of a
- document and grasping that the contents form an element structure, a
- hierarchy of elements, and that the nature of each element can be
- abstracted so that a content model can be defined which spans the
- varied use of each subelement.
-
- The concept of an entity comes from looking at the individual pieces of text
- that make up a whole document, and realizing that these pieces are
- independent of the element structure. E.g., a book may physically consist
- of several files on the author's disks. The element structure of the book
- spans all the disks and all the files, yet it's important to be able to
- refer to the files. What both complicates and simplifies matters is
- that we need to be able to refer to these pieces in a system- and
- storage-independent way. This is where the entity saves us a lot of
- trouble. Entities are named pieces of text.
-
- The abstraction that causes some confusion is over what a "piece of
- text" is, and, in particular, where it is found. We have looked at
- external entities, that is, entities which, when we refer to them,
- cause us to read a different file. We may also need to define short-
- hand notations for things in a document without needing an external
- file for every small piece of text. This means that entities have
- types, as well. There are internal entities: entities that are useful
- as short-hands for language constructs, entities that are text which
- is not to be interpreted, etc. And there are external entities:
- entities that are simply text, entities that are in a special notation
- (to be interpreted by a special program, perhaps with parameters), and
- entities which constitute larger parts of the administrative functions
- of the first and second part of the SGML document.
-
- Moreover, entities may be used both by the administrative parts and
- the user, and the user shouldn't have to worry about which entities
- are used by the administrative functions he doesn't see. So, entities
- come in two flavors, parameter entities and general entities.
-
- An "entity", then, is an abstraction over several types of text that
- you want to refer to by name. Once defined, you don't need to know
- where it is found, or of what kind it is -- all (general) entities
- look and feel the same to the user.
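-
- The kinds of entity described above could be declared along these
- lines. This is a sketch: every entity name, file name and notation
- name below is hypothetical, and how a system identifier such as
- "chap1.sgm" is mapped to real storage depends on the system in use.
-

```sgml
<!-- Internal general entity: shorthand for a piece of text -->
<!ENTITY sgml  "Standard Generalized Markup Language">
<!-- Internal entity whose text is not to be interpreted as markup -->
<!ENTITY qtag  CDATA "<Q>">
<!-- External general entities: text stored elsewhere, here in files -->
<!ENTITY chap1 SYSTEM "chap1.sgm">
<!ENTITY chap2 SYSTEM "chap2.sgm">
<!-- External entity in a special notation, handled by another program -->
<!NOTATION eps SYSTEM "eps">
<!ENTITY fig1  SYSTEM "fig1.eps" NDATA eps>
<!-- Parameter entity: used only inside declarations -->
<!ENTITY % inline "emph | phone">
<!ELEMENT para - O (#PCDATA | %inline;)*>
```

-
- In the document instance, the general entities are all referenced the
- same way, e.g. &sgml; or &chap1;, whatever their kind. The parameter
- entity is referenced only within declarations, as %inline; is here.
- An entity in a non-SGML notation such as fig1 is normally named as the
- value of an attribute rather than referenced in running text.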
-
-
- 14) Making comments/additions to this FAQ
- =========================================
-
- If you have any additional information, comments or questions, please
- email either of the authors of this FAQ. (Producers and suppliers of
- commercial software should note that we can only provide _very_ brief
- details of their product). If you wish to include some information,
- please indicate _clearly_ in your message where you think it should
- go in the FAQ (i.e. which section, etc.).
-
- Email: Erik Naggum <enag@ifi.uio.no>
- Michael Popham <M.G.Popham@exeter.ac.uk>
-
- Post: Erik Naggum, Naggum Software, Boks 1570, Vika, 0118 OSLO, Norway;
- Michael Popham, The SGML Project, Computer Unit, University of Exeter,
- Exeter EX4 4QE, UK.
-
- Fax: (Michael Popham) +44-392-211630
-
-
-
- =========================================================================
- COPYRIGHT NOTICE - Users are free to distribute this information in
- any form, PROVIDED THAT no charge (other than to cover reproduction
- costs) is made, the authors are acknowledged, and the final section on
- "Making comments/additions to this FAQ" and this notice are included
- in all copies.
- =========================================================================
-